Towards Web Search Engine Scale Data Mining
Abstract
Data mining is one of the most critical driving technologies behind Web search engines. Web search engine scale data mining poses many grand challenges, ranging from efficiency and scalability to diversity and adaptability. In this talk, I will review our recent effort on mining a very large amount of data accumulated in one of the major commercial search engines. In particular, we tackle the problem of context-aware search and query suggestion by employing statistical models. Moreover, we construct a very large statistical model (millions of states) from a very large amount of data (billions of sessions) by distributed data mining. I will also introduce some of our recent initiatives in Web mining.
Copyright © 2009, Australian Computer Society, Inc. This paper appeared at the Eighth Australasian Data Mining Conference (AusDM 2009), Melbourne, Australia. Conferences in Research and Practice in Information Technology (CRPIT), Vol. 101, Paul J. Kennedy, Kok-Leong Ong and Peter Christen, Eds. Reproduction for academic, not-for-profit purposes permitted provided this text is included.
Similar resources
Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology
Due to the growth of the Web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...
A Mathematical Approach for Improving the Performance of the Search Engine through Web Content Mining
The Internet is a rapidly growing technology that contains a vast and rich set of information stored on the Web. To retrieve, share, and process all this information from the Web, a tool called the search engine has been created, which plays an essential role for Web users. On searching the information from web servers through search engines, many irrelevant and redundant documents containing the r...
Proceedings of the 7th Dutch-Belgian Information Retrieval
This talk will present challenges involved in building a Web search engine and will touch on questions of system and algorithm design, in particular as they involve large scale data processing and data mining.
Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...
Web Mining: a Comparative Study
Currently, the World-Wide Web has developed into a distributed information space with nearly 100 million workstations and several billion pages, which makes it hard for people to find the information they need despite the huge amount of information available on the Web. The search engine is a very important tool for people to obtain information on the Internet, but the low-precision and low-recall exist wide...
Publication date: 2009